For each question below, show code. Once you’ve completed things, don’t forget to to upload this document (knitted version please!) to Blackboard!
A few tips:
install.packages() and load them using library().? or help() if you’re unsure about a function25 points total + 2 points extra-credit
1. Use tidycensus to download 1. race/ethnicity (B03002) and 2. median household income for Baltimore City. Store this data in a new object. Choose which race/ethnicity you’d like to relate to income (Non-Hispanic Black and Non-Hispanic White work best). Which census tract has the highest percentage of your target race/ethnicity (and what is the percent) and which has the highest median household income (and how much is it?)? (5 points) Reminder: Since we will be mapping our data, make sure you include use geometry = TRUE in get_acs()
# use tidycensus to obtain tract data for total pop, non-hispanic white pop, and household median income
md_race_income_2019 <- get_acs(geography = "tract",
variables = c("non_his_white" = "B03002_003", # non-hispanic white population
"total_pop" = "B03002_001", # total population
"med_hh_inc" = "B19013_001" # Median household income
),
year = 2019,
state = c(24), # Maryland
county = c(510), # Baltimore City
survey = "acs5", # 5 year survey
geometry = TRUE, #to download for mapping
output = "wide")
## Getting data from the 2015-2019 5-year ACS
# create a new variable to append to our data representing percent white population per tract
md_race_income_2019$whitepct <- (md_race_income_2019$non_his_whiteE / md_race_income_2019$total_popE) * 100
# Get a general idea of values for each column to narrow down filtering for max function
head(md_race_income_2019)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.655 ymin: 39.22373 xmax: -76.54867 ymax: 39.3311
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24510260202 Census Tract 2602.02, Baltimore city, Maryland 254
## 2 24510150100 Census Tract 1501, Baltimore city, Maryland 86
## 3 24510250303 Census Tract 2503.03, Baltimore city, Maryland 1117
## 4 24510180100 Census Tract 1801, Baltimore city, Maryland 5
## 5 24510020300 Census Tract 203, Baltimore city, Maryland 2855
## 6 24510250402 Census Tract 2504.02, Baltimore city, Maryland 1314
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 110 5681 637 41538 7943
## 2 80 2172 383 17371 10440
## 3 227 2229 274 38519 11718
## 4 8 2185 267 15195 5979
## 5 374 3761 416 108026 22478
## 6 283 4614 466 37451 12822
## geometry whitepct
## 1 MULTIPOLYGON (((-76.56596 3... 4.471044
## 2 MULTIPOLYGON (((-76.64501 3... 3.959484
## 3 MULTIPOLYGON (((-76.655 39.... 50.112158
## 4 MULTIPOLYGON (((-76.63413 3... 0.228833
## 5 MULTIPOLYGON (((-76.60051 3... 75.910662
## 6 MULTIPOLYGON (((-76.60817 3... 28.478544
md_race_income_2019 <- md_race_income_2019 %>%
filter(total_popE > 0)
# filter percent white pop to determine the max value
md_race_income_2019 %>%
filter(whitepct > 80) %>% # use values above 80% to narrow down our search
filter(whitepct== max(whitepct))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.61192 ymin: 39.25471 xmax: -76.59241 ymax: 39.27298
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24510240400 Census Tract 2404, Baltimore city, Maryland 2498
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 201 2721 190 110982 11576
## geometry whitepct
## 1 MULTIPOLYGON (((-76.61192 3... 91.80448
# filter medium household income to determine max value
md_race_income_2019 %>%
filter(med_hh_incE > 30000) %>% # use values above 30000 to narrow our search
filter(med_hh_incE== max(med_hh_incE))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.62923 ymin: 39.33853 xmax: -76.60958 ymax: 39.35387
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24510271102 Census Tract 2711.02, Baltimore city, Maryland 2724
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 440 4597 501 195156 48413
## geometry whitepct
## 1 MULTIPOLYGON (((-76.62923 3... 59.25604
2. Please reproject this data to Web Mercator. (1 points)
st_crs on initial variable to check the original projection: NAD83. I then used st_transform and stored the newly projected data in a new variable. I then used st_crs a second time on the new variable to ensure the reprojection to Web Mercator took place effectively.# check initial projection (should be NAD83)
st_crs(md_race_income_2019)
## Coordinate Reference System:
## User input: NAD83
## wkt:
## GEOGCRS["NAD83",
## DATUM["North American Datum 1983",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["latitude",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["longitude",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4269]]
# use st_transform to reproject
md_race_income_2019_webmerc <- st_transform(md_race_income_2019, crs = 3857)
# check to make sure the transform worked
st_crs(md_race_income_2019_webmerc)
## Coordinate Reference System:
## User input: EPSG:3857
## wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
## BASEGEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["Popular Visualisation Pseudo-Mercator",
## METHOD["Popular Visualisation Pseudo Mercator",
## ID["EPSG",1024]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Web mapping and visualisation."],
## AREA["World between 85.06°S and 85.06°N."],
## BBOX[-85.06,-180,85.06,180]],
## ID["EPSG",3857]]
3. Create two plots. In the first plot highlight the tract with the highest concentration of your selected race/eth. In the second plot highlight the tract with the highest median household income? (5 points)
# plot tract with highest % white population over the full Balt City map
md_race_income_2019_webmerc %>%
filter(whitepct > 80) %>%
filter(whitepct== max(whitepct)) %>%
ggplot() +
geom_sf(data = md_race_income_2019_webmerc) +
geom_sf(aes(fill = whitepct), lwd = NA) +
theme_dark() +
ggtitle("Census Tract in Baltimore City with highest percentage of White Population") +
theme(plot.title = element_text(size = 11.5)) +
xlab("Longitude") +
ylab("Latitude")
# plot tract with the highest median household income over the full Balt City map
md_race_income_2019_webmerc %>%
filter(med_hh_incE > 30000) %>%
filter(med_hh_incE== max(med_hh_incE)) %>%
ggplot() +
geom_sf(data = md_race_income_2019_webmerc) +
geom_sf(aes(fill = med_hh_incE), lwd = NA) +
theme_dark() +
ggtitle("Census Tract in Baltimore City with highest Median Household Income") +
theme(plot.title = element_text(size = 11.5)) +
xlab("Longitude") +
ylab("Latitude")
4. Create a third column using the bi_class function from the tutorial. (2 points)
# create classes using call for bi_class function of dimension 3 with jenks break
bi_md_race_inc <- bi_class(md_race_income_2019_webmerc, x = whitepct, y = med_hh_incE, style = "jenks", dim = 3)
## Warning in classInt::classIntervals(bins_y, n = dim, style = "jenks"): var has
## missing values, omitted in finding classes
# use table to ensure classes were properly allocated
table(bi_md_race_inc$bi_class)
##
## 1-1 1-2 1-NA 2-1 2-2 3-2 3-3
## 82 27 1 22 34 13 20
5. Create a bivariate map with your data. (3 points)
# store plot in a new variable `map` to call later when plotting with legend
map <- ggplot() +
geom_sf(data = bi_md_race_inc, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
bi_scale_fill(pal = "DkBlue", dim = 3) +
labs(
title = "Race and Income Baltimore City, MD",
subtitle = "Dark Blue (DkBlue) Palette"
) +
bi_theme()
# plot the new map
plot(map)
6. Use the cowplot package and ggdraw, like in the tutorial to add a legend (2 points).
# create bivaraite legend and store in new variable to plot with map
legend <- bi_legend(pal = "DkBlue",
dim = 3,
xlab = "Higher % White ",
ylab = "Higher Income ",
size = 7)
# combine map with legend into new variable
finalPlot <- ggdraw() +
draw_plot(map, 0, 0, 1, 1) +
draw_plot(legend, 0.22, .05, 0.25, 0.25)
# plot variable containing map and legend
plot(finalPlot)
7. Rinse and repeat for another county of your choosing, using a different color scheme. Be sure to use Psuedo-Mercator (3857). (5 points)
# use tidycensus to obtain tract data for total pop, non-hispanic white pop, and household median income for Calvert County
cc_race_income_2019 <- get_acs(geography = "tract",
variables = c("non_his_white" = "B03002_003", # non-hispanic white population
"total_pop" = "B03002_001", # total population
"med_hh_inc" = "B19013_001" # Median household income
),
year = 2019,
state = c(24), # Maryland
county = c(9), # Calvert County
survey = "acs5", # 5 year survey
geometry = TRUE, #to download for mapping
output = "wide")
## Getting data from the 2015-2019 5-year ACS
# create a new variable to append to our data representing percent white population per tract
cc_race_income_2019$whitepct <- (cc_race_income_2019$non_his_whiteE / cc_race_income_2019$total_popE) * 100
# Get a general idea of values for each column to narrow down filtering for max function
head(cc_race_income_2019)
## Simple feature collection with 6 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.70196 ymin: 38.3751 xmax: -76.4811 ymax: 38.76928
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24009860702 Census Tract 8607.02, Calvert County, Maryland 1902
## 2 24009860102 Census Tract 8601.02, Calvert County, Maryland 2672
## 3 24009860101 Census Tract 8601.01, Calvert County, Maryland 2445
## 4 24009860501 Census Tract 8605.01, Calvert County, Maryland 5308
## 5 24009860801 Census Tract 8608.01, Calvert County, Maryland 5280
## 6 24009860402 Census Tract 8604.02, Calvert County, Maryland 2700
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 275 3256 345 82006 15372
## 2 221 2990 238 113188 17489
## 3 263 3159 210 135375 14592
## 4 426 7020 412 149563 14119
## 5 547 6900 624 126081 17096
## 6 368 3599 428 99779 15300
## geometry whitepct
## 1 MULTIPOLYGON (((-76.60506 3... 58.41523
## 2 MULTIPOLYGON (((-76.66229 3... 89.36455
## 3 MULTIPOLYGON (((-76.70121 3... 77.39791
## 4 MULTIPOLYGON (((-76.61466 3... 75.61254
## 5 MULTIPOLYGON (((-76.60889 3... 76.52174
## 6 MULTIPOLYGON (((-76.56918 3... 75.02084
cc_race_income_2019 <- cc_race_income_2019 %>%
filter(total_popE > 0)
# filter percent white pop to determine the max value
cc_race_income_2019 %>%
filter(whitepct > 80) %>% # use values above 80% to narrow down our search
filter(whitepct== max(whitepct))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.471 ymin: 38.36306 xmax: -76.38148 ymax: 38.45049
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24009861001 Census Tract 8610.01, Calvert County, Maryland 1056
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 122 1148 112 144219 22239
## geometry whitepct
## 1 MULTIPOLYGON (((-76.47093 3... 91.98606
# filter medium household income to determine max value
cc_race_income_2019 %>%
filter(med_hh_incE > 30000) %>% # use values above 30000 to narrow our search
filter(med_hh_incE== max(med_hh_incE))
## Simple feature collection with 1 feature and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -76.701 ymin: 38.63048 xmax: -76.59602 ymax: 38.72666
## Geodetic CRS: NAD83
## GEOID NAME non_his_whiteE
## 1 24009860200 Census Tract 8602, Calvert County, Maryland 4890
## non_his_whiteM total_popE total_popM med_hh_incE med_hh_incM
## 1 361 6138 361 155548 23827
## geometry whitepct
## 1 MULTIPOLYGON (((-76.70067 3... 79.66764
# check initial projection (should be NAD83)
st_crs(cc_race_income_2019)
## Coordinate Reference System:
## User input: NAD83
## wkt:
## GEOGCRS["NAD83",
## DATUM["North American Datum 1983",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["latitude",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["longitude",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4269]]
# use st_transform to reproject
cc_race_income_2019_webmerc <- st_transform(cc_race_income_2019, crs = 3857)
# check to make sure the transform worked
st_crs(cc_race_income_2019_webmerc)
## Coordinate Reference System:
## User input: EPSG:3857
## wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
## BASEGEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["Popular Visualisation Pseudo-Mercator",
## METHOD["Popular Visualisation Pseudo Mercator",
## ID["EPSG",1024]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Web mapping and visualisation."],
## AREA["World between 85.06°S and 85.06°N."],
## BBOX[-85.06,-180,85.06,180]],
## ID["EPSG",3857]]
# create classes using call for bi_class function of dimension 3 with quantile break
bi_cc_race_inc <- bi_class(cc_race_income_2019_webmerc, x = whitepct, y = med_hh_incE, style = "quantile", dim = 3)
# use table to ensure classes were properly allocated
table(bi_cc_race_inc$bi_class)
##
## 1-1 1-2 1-3 2-1 2-2 2-3 3-1 3-2 3-3
## 3 2 1 1 1 4 2 3 1
# store plot in a new variable `cc_map` to call later when plotting with legend
cc_map <- ggplot() +
geom_sf(data = bi_cc_race_inc, mapping = aes(fill = bi_class), color = "white", size = 0.1, show.legend = FALSE) +
bi_scale_fill(pal = "DkViolet", dim = 3) +
labs(
title = "Race and Income Calvert County, MD",
subtitle = "Dark Blue (DkViolet) Palette"
) +
bi_theme()
# create bivaraite legend and store in new variable to plot with map
cc_legend <- bi_legend(pal = "DkViolet",
dim = 3,
xlab = "Higher % White ",
ylab = "Higher Income ",
size = 7)
# combine map with legend
cc_finalPlot <- ggdraw() +
draw_plot(cc_map, 0, 0, 1, 1) +
draw_plot(cc_legend, 0.22, .05, 0.25, 0.25)
# plot final map
plot(cc_finalPlot)
8. Write the bi_class output to a geojson file. (1 points)
# Write original object out as geojson file to open in qgis
st_write(bi_cc_race_inc, "bi_cc_race_inc.geojson", append = TRUE)
## Updating layer `bi_cc_race_inc' to data source `bi_cc_race_inc.geojson' using driver `GeoJSON'
## Updating existing layer bi_cc_race_inc
## Writing 18 features with 10 fields and geometry type Multi Polygon.
9. Now open your geojson output and create a QGIS map of your bivariate map. Put an image of that map here. (2 points)